45 research outputs found

    LLMs and Finetuning: Benchmarking cross-domain performance for hate speech detection

    This paper compares different pre-trained and fine-tuned large language models (LLMs) for hate speech detection. Our research underscores challenges in LLMs' cross-domain validity and their overfitting risks. Through evaluations, we highlight the need for fine-tuned models that grasp the nuances of hate speech through greater label heterogeneity. We conclude with a vision for the future of hate speech detection, emphasizing cross-domain generalizability and appropriate benchmarking practices. Comment: 9 pages, 3 figures, 4 tables.

    It Takes Two to Negotiate: Modeling Social Exchange in Online Multiplayer Games

    Online games are dynamic environments where players interact with each other, which offers a rich setting for understanding how players negotiate their way through the game to an ultimate victory. This work studies online player interactions during the turn-based strategy game, Diplomacy. We annotated a dataset of over 10,000 chat messages for different negotiation strategies and empirically examined their importance in predicting long- and short-term game outcomes. Although negotiation strategies can be predicted reasonably accurately through the linguistic modeling of the chat messages, more is needed for predicting short-term outcomes such as trustworthiness. On the other hand, they are essential in graph-aware reinforcement learning approaches to predict long-term outcomes, such as a player's success, based on their prior negotiation history. We close with a discussion of the implications and impact of our work. The dataset is available at https://github.com/kj2013/claff-diplomacy. Comment: 28 pages, 11 figures. Accepted to CSCW '24 and forthcoming in the Proceedings of ACM HCI '24.

    Social Media and Electoral Predictions: A Meta-Analytic Review

    Can social media data be used to make reasonably accurate estimates of electoral outcomes? We conducted a meta-analytic review examining how well different methods and two classes of social media features predict political elections: (1) content features and (2) structural features. Across 45 published studies, we find significant variance in the quality of predictions, which on average still lag behind those of traditional survey research. More specifically, we find that machine learning-based approaches generally outperform lexicon-based analyses, and that combining structural and content features yields the most accurate predictions.
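The contrast between lexicon-based and machine-learning-based prediction can be illustrated with a toy sketch. All data, lexicon terms, and weights below are hypothetical: a lexicon score simply counts sentiment terms in posts (a content feature), while a combined score also weights a structural feature such as retweet volume; in the studies reviewed, such weights would be learned from labeled election outcomes rather than hand-set.

```python
# Toy contrast: lexicon-based scoring vs. combining content and
# structural features. All data, lexicons, and weights are made up.

POSITIVE = {"great", "win", "support", "love"}
NEGATIVE = {"bad", "lose", "corrupt", "hate"}

def lexicon_score(posts):
    """Content-only feature: net count of positive vs. negative terms."""
    score = 0
    for post in posts:
        words = post.lower().split()
        score += sum(w in POSITIVE for w in words)
        score -= sum(w in NEGATIVE for w in words)
    return score

def combined_score(posts, retweets, w_content=1.0, w_structure=0.01):
    """Combine the content feature with a structural feature
    (total retweets) using hand-set illustrative weights."""
    return w_content * lexicon_score(posts) + w_structure * sum(retweets)

posts = ["Great rally, we will win", "Opponent is corrupt"]
retweets = [120, 30]
content_only = lexicon_score(posts)            # nets to 1 on this sample
both_features = combined_score(posts, retweets)
```

A real pipeline would learn the feature weights from past election results, which is exactly where the reviewed machine-learning approaches gain their edge over fixed lexicons.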

    Understanding and Measuring Psychological Stress using Social Media

    A body of literature has demonstrated that users' mental health conditions, such as depression and anxiety, can be predicted from their social media language. However, there is still a gap in the scientific understanding of how psychological stress is expressed on social media. Stress is one of the primary underlying causes and correlates of chronic physical illnesses and mental health conditions. In this paper, we explore the language of psychological stress with a dataset of 601 social media users, who answered the Perceived Stress Scale questionnaire and also consented to share their Facebook and Twitter data. First, we find that stressed users post about exhaustion, losing control, increased self-focus, and physical pain, whereas users who are not stressed post about breakfast, family time, and travel. Second, we find that Facebook language is more predictive of stress than Twitter language. Third, we demonstrate how the language-based models thus developed can be adapted and scaled to measure county-level trends. Since county-level language is easily available on Twitter using the Streaming API, we explore multiple domain adaptation algorithms to adapt user-level Facebook models to Twitter language. We find that domain-adapted and scaled social media-based measurements of stress outperform sociodemographic variables (age, gender, race, education, and income), against ground-truth survey-based stress measurements, both at the user- and the county-level in the U.S. Twitter language that scores higher in stress is also predictive of poorer health, less access to facilities and lower socioeconomic status in counties. We conclude with a discussion of the implications of using social media as a new tool for monitoring stress levels of both individuals and counties. Comment: Accepted for publication in the proceedings of ICWSM 2019.
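One of the simplest forms of domain adaptation in a setup like this is distribution matching: rescale each Twitter-derived feature so its mean and variance match those of the Facebook training domain before applying the Facebook-trained model. The sketch below is a minimal, hypothetical version of that idea with fabricated feature values, not the authors' actual algorithms.

```python
import statistics

def fit_stats(values):
    """Mean and (population) standard deviation of a feature in one domain."""
    return statistics.mean(values), statistics.pstdev(values)

def adapt(values, source_stats, target_stats):
    """Shift and rescale target-domain feature values so their
    distribution matches the source domain the model was trained on."""
    src_mean, src_sd = source_stats
    tgt_mean, tgt_sd = target_stats
    return [src_mean + (v - tgt_mean) * (src_sd / tgt_sd) for v in values]

# Hypothetical feature: rate of first-person pronouns per post.
facebook = [0.10, 0.12, 0.08, 0.14]   # source domain (model trained here)
twitter  = [0.30, 0.34, 0.26, 0.38]   # target domain (shifted distribution)

adapted = adapt(twitter, fit_stats(facebook), fit_stats(twitter))
```

After adaptation, the Twitter features occupy the same scale the Facebook-trained model expects, which is the precondition for applying a user-level model to county-level Twitter language.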

    Literature review writing: a study of information selection from cited papers / Kokil Jaidka, Christopher Khoo and Jin-Cheon Na

    This paper reports the results of a small study of how researchers select and edit research information from cited papers to include in a literature review. This is part of a bigger content analysis and linguistic analysis of literature reviews. This study aims to answer the following questions: Where do authors select information from in the cited papers (e.g., Abstract, Introduction, Conclusion section, etc.)? What types of information do they select (e.g., research objectives, results, etc.)? And how do they transform that information (e.g., paraphrasing, cut-pasting, etc.)? To answer these questions, we analyzed the literature review section of 20 articles from the Journal of the American Society for Information Science & Technology, 2001-2008. Referencing sentences were mapped to source papers to determine their origin. Other features of the source information were also annotated, such as the type of information selected and the types of editing changes made to it before inclusion in the literature review. Preliminary results indicate that authors prefer to select information from the Abstract, Introduction and Conclusion sections of the cited papers. This information is transformed through cut-paste, paraphrase or higher-level semantic transformations to describe the research objective, methodology and results of the referenced study. The choices made in selecting and transforming the source information appeared to be related to the two styles of literature review finally constructed – integrative and descriptive literature reviews. Keywords: Literature reviews; Multi-document summarization; Information science; Information extraction; Information selection
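The mapping step described above, tracing a referencing sentence back to a section of the cited paper, can be approximated automatically with lexical overlap. This toy sketch (all paper text and section contents are fabricated) assigns a sentence to whichever section shares the most content words with it; the study itself did this mapping manually.

```python
STOPWORDS = {"the", "a", "an", "of", "in", "to", "and", "we", "this", "is"}

def content_words(text):
    """Lowercased word set with trailing punctuation and stopwords removed."""
    return {w.strip(".,").lower() for w in text.split()} - STOPWORDS

def map_to_section(sentence, sections):
    """Return the section whose text shares the most content words with
    the referencing sentence (a crude stand-in for manual mapping)."""
    words = content_words(sentence)
    return max(sections, key=lambda name: len(words & content_words(sections[name])))

cited_paper = {
    "Abstract": "We propose a summarization method and evaluate it on news corpora.",
    "Introduction": "Summarization research has grown; prior systems extract sentences.",
    "Conclusion": "Our method improves ROUGE scores over extractive baselines.",
}
sentence = "Smith et al. propose a summarization method evaluated on news corpora."
origin = map_to_section(sentence, cited_paper)
```

Because the referencing sentence paraphrases the abstract's wording, the overlap heuristic traces it to the Abstract, mirroring the study's finding that abstracts are a favored source.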

    Predicting Sentence-Level Factuality of News and Bias of Media Outlets

    Predicting the factuality of news reporting and the bias of media outlets is relevant for automated news credibility assessment and fact-checking. While prior work has focused on the veracity of news, we propose a fine-grained reliability analysis of the entire media outlet. Specifically, we study the prediction of sentence-level factuality of news reporting and bias of media outlets, which may explain more accurately the overall reliability of the entire source. We first manually produced a large sentence-level dataset, titled "FactNews", composed of 6,191 sentences expertly annotated according to factuality and media bias definitions from AllSides. We then present baseline models for sentence-level factuality prediction, obtained by fine-tuning BERT. Finally, given the severity of fake news and political polarization in Brazil, both the dataset and the baselines are presented for Portuguese; however, our approach may be applied to any other language.

    Just Another Day on Twitter: A Complete 24 Hours of Twitter Data

    At the end of October 2022, Elon Musk concluded his acquisition of Twitter. In the weeks and months before that, several questions were publicly discussed that were not only of interest to the platform's future buyers, but also of high relevance to the Computational Social Science research community. For example, how many active users does the platform have? What percentage of accounts on the site are bots? And, what are the dominating topics and sub-topical spheres on the platform? In a globally coordinated effort of 80 scholars to shed light on these questions, and to offer a dataset that will equip other researchers to do the same, we have collected all 375 million tweets published within a 24-hour time period starting on September 21, 2022. To the best of our knowledge, this is the first complete 24-hour Twitter dataset that is available for the research community. With it, the present work aims to accomplish two goals. First, we seek to answer the aforementioned questions and provide descriptive metrics about Twitter that can serve as references for other researchers. Second, we create a baseline dataset for future research that can be used to study the potential impact of the platform's ownership change.
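Descriptive metrics such as the number of active accounts in a dump of this kind are typically computed with a single streaming pass over JSON-lines data. The sketch below is an illustrative, minimal version: the nested `user.id_str` field follows the standard Twitter v1.1 tweet object, but the three sample records are fabricated stand-ins for the 375-million-tweet dataset.

```python
import json

def count_active_users(lines):
    """One pass over a JSONL tweet dump: returns the number of distinct
    accounts that posted at least once, plus tweets per account."""
    per_user = {}
    for line in lines:
        tweet = json.loads(line)
        uid = tweet["user"]["id_str"]
        per_user[uid] = per_user.get(uid, 0) + 1
    return len(per_user), per_user

# Fabricated three-line sample standing in for the full 24-hour dump.
sample = [
    '{"id_str": "1", "user": {"id_str": "42"}}',
    '{"id_str": "2", "user": {"id_str": "42"}}',
    '{"id_str": "3", "user": {"id_str": "99"}}',
]
n_users, counts = count_active_users(sample)
```

Streaming line by line keeps memory proportional to the number of distinct users rather than the number of tweets, which matters at the scale of hundreds of millions of records.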